Towards Solving Large-Scale POMDP Problems Via Spatio-Temporal Belief State Clustering

Authors

  • Xin Li
  • William K. Cheung
  • Jiming Liu
Abstract

The Markov decision process (MDP) is commonly used to model a stochastic environment for supporting optimal decision making. However, solving a large-scale MDP under the partially observable condition (i.e., a POMDP) is known to be computationally intractable. Belief compression, which reduces the dimension of the belief state, has recently been shown to be an effective way of making the problem tractable. With the conjecture that temporally close belief states should possess a low intrinsic degree of freedom due to problem regularity, this paper proposes to cluster the belief states based on a criterion function measuring the belief states' spatial and temporal differences. Further reduction of the belief state dimension can then result in a more efficient POMDP solver. The proposed method has been tested on a synthesized navigation problem (Hallway2) and is empirically shown to be a promising direction towards solving large-scale POMDP problems. Some future research directions are also included.
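To make the idea behind the criterion function concrete, the sketch below combines a spatial term (an L2 distance between belief vectors) with a temporal term (the difference between their time indices) and runs a simple k-medoids-style grouping over the combined distance. The weighting parameter `lam`, the L2 spatial term, and the k-medoids grouping are illustrative assumptions, not the paper's actual criterion or clustering procedure.

```python
# Illustrative sketch only: a spatio-temporal distance between belief states
# and a naive clustering pass over it. The weighting parameter `lam`, the L2
# spatial term, and the k-medoids grouping are assumptions, not the paper's method.
import numpy as np

def spatio_temporal_distance(b_i, b_j, t_i, t_j, lam=0.5):
    """Combine a spatial (belief-vector) and a temporal (time-index) difference."""
    spatial = np.linalg.norm(b_i - b_j)   # distance in belief space
    temporal = abs(t_i - t_j)             # distance in time steps
    return spatial + lam * temporal

def cluster_beliefs(beliefs, times, k, lam=0.5, iters=20, seed=0):
    """k-medoids-style clustering of (belief, time) pairs under the combined distance."""
    rng = np.random.default_rng(seed)
    n = len(beliefs)
    medoids = rng.choice(n, size=k, replace=False)
    assign = np.zeros(n, dtype=int)
    for _ in range(iters):
        # assign every belief to its nearest medoid
        for i in range(n):
            dists = [spatio_temporal_distance(beliefs[i], beliefs[m],
                                              times[i], times[m], lam)
                     for m in medoids]
            assign[i] = int(np.argmin(dists))
        # move each medoid to the member minimising total intra-cluster distance
        for c in range(k):
            members = np.where(assign == c)[0]
            if members.size == 0:
                continue
            costs = [sum(spatio_temporal_distance(beliefs[i], beliefs[j],
                                                  times[i], times[j], lam)
                         for j in members)
                     for i in members]
            medoids[c] = members[int(np.argmin(costs))]
    return assign, medoids
```

Each resulting cluster could then be compressed independently (e.g., by a low-dimensional projection per cluster), which is the kind of further dimension reduction the abstract refers to.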

Related articles

POMDP Compression and Decomposition via Belief State Analysis

Partially observable Markov decision process (POMDP) is a commonly adopted mathematical framework for solving planning problems in stochastic environments. However, computing the optimal POMDP policy for large-scale problems is known to be intractable, with the high dimensionality of the underlying belief state space being one of the major causes. Our research focuses on studying two different...

On the Linear Belief Compression of POMDPs: A re-examination of current methods

Belief compression improves the tractability of large-scale partially observable Markov decision processes (POMDPs) by finding projections from high-dimensional belief space onto low-dimensional approximations, where solving to obtain action selection policies requires fewer computations. This paper develops a unified theoretical framework to analyse three existing linear belief compression app...
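As a loose illustration of what a linear belief compression looks like, the sketch below projects sampled belief vectors onto a low-dimensional basis obtained by PCA and reconstructs approximate beliefs from the compressed representation. PCA is only one possible linear scheme; the basis size, the sampling of beliefs, and the renormalisation step are assumptions made for this sketch, not the specific methods the paper analyses.

```python
# Illustrative linear belief compression via PCA: project high-dimensional
# belief vectors onto a small basis and reconstruct them. The choice of PCA,
# the basis size, and the renormalisation step are assumptions for this sketch.
import numpy as np

def fit_linear_compression(belief_samples, k):
    """belief_samples: (n, |S|) array of sampled beliefs; k: compressed dimension."""
    mean = belief_samples.mean(axis=0)
    centered = belief_samples - mean
    # top-k principal directions of the sampled belief set
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                        # (k, |S|) projection matrix
    return mean, basis

def compress(b, mean, basis):
    return basis @ (b - mean)             # low-dimensional representation b_tilde

def reconstruct(b_tilde, mean, basis):
    b_hat = basis.T @ b_tilde + mean      # back to the original belief space
    b_hat = np.clip(b_hat, 0.0, None)
    return b_hat / b_hat.sum()            # renormalise to a valid distribution
```

Planning can then proceed in the k-dimensional compressed space, with reconstruction needed only when an approximate full belief is required.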

Finding Approximate POMDP solutions Through Belief Compression

Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the ful...
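The point about the full belief space can be made concrete by looking at how beliefs arise in practice: only the beliefs reachable from the initial belief under some policy are ever encountered, and each of them is produced by the standard Bayes belief update. The sketch below collects reachable beliefs by simulating random actions; the model arrays T[a, s, s'] and O[a, s', o] and the random exploration policy are assumptions for illustration.

```python
# Sketch: the Bayes belief update and a random-walk collection of reachable
# beliefs. T[a, s, s'] and O[a, s', o] are assumed model arrays; the random
# exploration policy is an illustrative stand-in for whatever policy is used.
import numpy as np

def belief_update(b, a, o, T, O):
    """b'(s') is proportional to O[a, s', o] * sum_s T[a, s, s'] * b(s)."""
    predicted = T[a].T @ b                # predict the next-state distribution
    unnormalised = O[a][:, o] * predicted
    z = unnormalised.sum()
    return unnormalised / z if z > 0 else b

def sample_reachable_beliefs(b0, T, O, n_steps=500, seed=0):
    """Collect beliefs visited while acting randomly from the initial belief b0."""
    rng = np.random.default_rng(seed)
    n_actions, n_states, _ = T.shape
    n_obs = O.shape[2]
    beliefs, b, s = [b0], b0.copy(), rng.choice(n_states, p=b0)
    for _ in range(n_steps):
        a = rng.integers(n_actions)
        s = rng.choice(n_states, p=T[a, s])   # sample the next hidden state
        o = rng.choice(n_obs, p=O[a, s])      # sample an observation
        b = belief_update(b, a, o, T, O)
        beliefs.append(b)
    return np.array(beliefs)
```

The collected beliefs typically occupy a small, structured region of the probability simplex, which is what belief compression (and, in the main paper above, clustering) exploits.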

On modeling of large-scale environments for solving spatio-temporal planning problems

The paper introduces a cognitively motivated approach for structuring and representing unfamiliar large-scale environments. The proposed region-based representation facilitates collaborative spatio-temporal planning, where the problem-solving process is shared between a user and an assistance system. The problem domain is structured hierarchically into regions resembling human decision space. The...

Anytime Point-Based Approximations for Large POMDPs

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, ...
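The value backup at a specific belief point has a standard alpha-vector form, sketched below. It follows the usual point-based value iteration backup and is not tied to the particular anytime algorithm described here; R[a, s], T[a, s, s'] and O[a, s', o] are assumed model arrays.

```python
# Standard point-based alpha-vector backup at a single belief b, as used by
# point-based value iteration methods. Gamma is the current set of alpha
# vectors; R[a, s], T[a, s, s'], O[a, s', o] are assumed model arrays.
import numpy as np

def point_based_backup(b, Gamma, R, T, O, gamma=0.95):
    """One alpha-vector backup at belief b. Gamma: list of current alpha vectors."""
    n_actions, n_states, _ = T.shape
    n_obs = O.shape[2]
    best_alpha, best_value = None, -np.inf
    for a in range(n_actions):
        alpha_a = R[a].astype(float).copy()
        for o in range(n_obs):
            # g_{a,o}^alpha(s) = sum_{s'} T[a, s, s'] * O[a, s', o] * alpha(s')
            g = np.array([T[a] @ (O[a][:, o] * alpha) for alpha in Gamma])
            alpha_a += gamma * g[np.argmax(g @ b)]  # best alpha for this observation branch
        value = alpha_a @ b
        if value > best_value:
            best_alpha, best_value = alpha_a, value
    return best_alpha  # new alpha vector maximising the backed-up value at b
```

Applying this backup only at a finite set of belief points, rather than over the whole belief simplex, is what keeps point-based methods tractable.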

Publication year: 2005